Smart Paper Notes

Tensions Between the Proxies of Human Values in AI

Teresa Datta , Daniel Nissani , Max Cembalest , Akash Khanna , Haley Massa , John P. Dickerson

Category: Machine Learning | Artificial Intelligence

2022-12-14

Motivated by mitigating potentially harmful impacts of technologies, the AI community has formulated and accepted mathematical definitions for certain pillars of accountability: e.g. privacy, fairness, and model transparency. Yet, we argue this is fundamentally misguided because these definitions are imperfect, siloed constructions of the human values they hope to proxy, while giving the guise that those values are sufficiently embedded in our technologies. Under popularized methods, tensions arise when practitioners attempt to achieve each pillar of fairness, privacy, and transparency in isolation or simultaneously. In this position paper, we push for redirection. We argue that the AI community needs to consider all the consequences of choosing certain formulations of these pillars -- not just the technical incompatibilities, but also the effects within the context of deployment. We point towards sociotechnical research for frameworks for the latter, but push for broader efforts into implementing these in practice.


Let Users Decide: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse

Martin Pawelczyk , Teresa Datta , Johannes van-den-Heuvel , Gjergji Kasneci , Himabindu Lakkaraju

Category: Machine Learning

2022-03-13

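As machine learning (ML) models are increasingly used to make consequential decisions, there has been growing interest in developing techniques that can provide recourse to affected individuals. Most of these techniques provide recourses under the assumption that affected individuals will implement the prescribed recourses exactly. However, recourses are often implemented in a noisy and inconsistent manner: for example, an individual who was asked to increase their salary by $500 may get a promotion that increases it by $505 instead. Motivated by this, we study the problem of recourse invalidation in the face of noisy human responses. More specifically, we theoretically and empirically analyze the behavior of state-of-the-art algorithms and demonstrate that the recourses these algorithms generate are very likely to be invalidated (i.e., they may result in negative outcomes) if small changes are made to them. We further propose a novel framework, Expecting noisy responses (EXPECT), which addresses this problem by explicitly minimizing the probability of recourse invalidation in the face of noisy responses. Our framework can ensure that the resulting recourses are invalidated at most r% of the time, where r is provided as input by the end user requesting recourse. By doing so, our framework gives end users greater control in navigating the trade-off between recourse cost and robustness to noisy responses. Experimental evaluation with multiple real-world datasets demonstrates the efficacy of the proposed framework and validates our theoretical findings.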
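
The notion of recourse invalidation under noisy implementation can be illustrated with a small sketch. This is not the framework from the abstract; it only estimates how often a suggested recourse flips back to a negative outcome when the individual lands near, but not exactly on, the prescribed point. The linear classifier, the isotropic Gaussian noise model, and the names `invalidation_rate_mc` and `invalidation_rate_linear` are all assumptions made for illustration.

```python
from math import erf, sqrt

import numpy as np


def invalidation_rate_mc(w, b, x_recourse, sigma, n_samples=10_000, seed=0):
    """Monte Carlo estimate of P[f(x' + eps) is negative], eps ~ N(0, sigma^2 I).

    f is an assumed linear classifier: positive outcome iff w.x + b >= 0.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples, x_recourse.shape[0]))
    scores = (x_recourse + noise) @ w + b
    return float(np.mean(scores < 0))


def invalidation_rate_linear(w, b, x_recourse, sigma):
    """Closed form for the same linear/Gaussian setting.

    The noisy score is Gaussian with mean w.x' + b and std sigma * ||w||,
    so the invalidation probability is Phi(-margin).
    """
    margin = (w @ x_recourse + b) / (sigma * np.linalg.norm(w))
    return 0.5 * (1.0 - erf(margin / sqrt(2.0)))


# Hypothetical two-feature example.
w = np.array([1.0, 2.0])
b = -3.0
x_cheap = np.array([1.0, 1.05])   # barely crosses the decision boundary
x_robust = np.array([1.0, 1.50])  # costlier, but further from the boundary

for name, x in [("cheap", x_cheap), ("robust", x_robust)]:
    mc = invalidation_rate_mc(w, b, x, sigma=0.1)
    exact = invalidation_rate_linear(w, b, x, sigma=0.1)
    print(f"{name}: monte carlo={mc:.3f}, closed form={exact:.3f}")
```

In this toy setting the cheaper recourse is invalidated far more often than the costlier one, which is the cost versus robustness trade-off the abstract lets the end user control through r.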


Deep Learning for Space Weather Prediction: Bridging the Gap between Heliophysics Data and Theory

John C. Dorelli , Chris Bard , Thomas Y. Chen , Daniel Da Silva , Luiz Fernando Guides dos Santos , Jack Ireland , Michael Kirk , Ryan McGranaghan , Ayris Narock , Teresa Nieves-Chinchilla

Category: Machine Learning

2022-12-27

Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained by data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.


In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision

Gourav Datta , Zeyu Liu , Md Abdullah-Al Kaiser , Souvik Kundu , Joe Mathai , Zihan Yin , Ajey P. Jacob , Akhilesh R. Jaiswal , Peter A. Beerel

Category: Computer Vision

2022-12-21

Due to the high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment and increasing spiking activity and, consequently, energy consumption. Recent works proposed direct encoding that directly feeds the analog pixel values in the first layer of the SNN in order to significantly reduce the number of time steps. Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.


Hoyer regularizer is all you need for ultra low-latency spiking neural networks

Gourav Datta , Zeyu Liu , Peter A. Beerel

Category: Computer Vision

2022-12-20

Spiking neural networks (SNNs) have emerged as an attractive spatio-temporal computing paradigm for a wide range of low-power vision tasks. However, state-of-the-art (SOTA) SNN models either incur multiple time steps which hinder their deployment in real-time use cases or increase the training complexity significantly. To mitigate this concern, we present a training framework (from scratch) for one-time-step SNNs that uses a novel variant of the recently proposed Hoyer regularizer. We estimate the threshold of each SNN layer as the Hoyer extremum of a clipped version of its activation map, where the clipping threshold is trained using gradient descent with our Hoyer regularizer. This approach not only downscales the value of the trainable threshold, thereby emitting a large number of spikes for weight update with a limited number of iterations (due to only one time step) but also shifts the membrane potential values away from the threshold, thereby mitigating the effect of noise that can degrade the SNN accuracy. Our approach outperforms existing spiking, binary, and adder neural networks in terms of the accuracy-FLOPs trade-off for complex image recognition tasks. Downstream experiments on object detection also demonstrate the efficacy of our approach.
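
The threshold estimate described above can be sketched in a few lines of NumPy. The abstract does not give formulas, so the definitions here are assumptions: the Hoyer-square measure is taken as (||z||_1)^2 / ||z||_2^2 and the Hoyer extremum as ||z||_2^2 / ||z||_1. The function names and the toy membrane potentials are hypothetical, and no training of the clipping threshold is shown.

```python
import numpy as np


def hoyer_regularizer(z, eps=1e-12):
    """Assumed Hoyer-square sparsity measure: (||z||_1)^2 / ||z||_2^2.

    Equals 1 for a one-hot vector and n for a uniform vector of length n,
    so minimizing it encourages sparsity.
    """
    z = np.abs(np.asarray(z, dtype=np.float64)).ravel()
    return float(z.sum() ** 2 / (np.sum(z ** 2) + eps))


def hoyer_extremum(z, eps=1e-12):
    """Assumed Hoyer extremum: ||z||_2^2 / ||z||_1."""
    z = np.abs(np.asarray(z, dtype=np.float64)).ravel()
    return float(np.sum(z ** 2) / (z.sum() + eps))


def one_step_spikes(membrane, clip_threshold):
    """Single-time-step spiking with a data-dependent threshold.

    The membrane potentials are clipped to [0, clip_threshold]; the layer
    threshold is the Hoyer extremum of the clipped map, which can never
    exceed clip_threshold (the "downscaling" mentioned in the abstract).
    """
    membrane = np.asarray(membrane, dtype=np.float64)
    z_clip = np.clip(membrane, 0.0, clip_threshold)
    threshold = hoyer_extremum(z_clip)
    spikes = (membrane >= threshold).astype(np.float32)
    return spikes, threshold


# Hypothetical membrane potentials for one layer.
u = np.array([0.2, 0.5, 1.5, -0.3])
spikes, thr = one_step_spikes(u, clip_threshold=1.0)
print("threshold:", thr, "spikes:", spikes)
```

In a real training loop the clipping threshold would be a learnable parameter and the regularizer would be added to the task loss; both are omitted here.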


A Generalized Framework for Critical Heat Flux Detection Using Unsupervised Image-to-Image Translation

Firas Al-Hindawi , Tejaswi Soorib , Han Hu , Md Siddiquee , Hyunsoo Yoon , Teresa Wu , Ying Sun

Category: Computer Vision

2022-12-18

This work proposes a framework for generalizing Critical Heat Flux (CHF) detection classification models using an Unsupervised Image-to-Image (UI2I) translation model. The framework enables a typical classification model that was trained and tested on boiling images from domain A to predict boiling images from domain B, which the classification model has never seen. This is done by using the UI2I model to transform the domain B images to look like the domain A images that the classification model is familiar with. Although a CNN was used as the classification model and Fixed-Point GAN (FP-GAN) was used as the UI2I model, the framework is model agnostic, meaning that it can generalize any type of image classification model. This makes it applicable to a variety of similar applications, not only the boiling crisis detection problem. It also means that the framework's performance should improve as UI2I models advance.


Physics-informed Neural Networks with Periodic Activation Functions for Solute Transport in Heterogeneous Porous Media

Salah A Faroughi , Pingki Datta , Seyed Kourosh Mahjour , Shirko Faroughi

Category: Machine Learning

2022-12-17

Solute transport in porous media is relevant to a wide range of applications in hydrogeology, geothermal energy, underground CO2 storage, and a variety of chemical engineering systems. Due to the complexity of solute transport in heterogeneous porous media, traditional solvers require high resolution meshing and are therefore expensive computationally. This study explores the application of a mesh-free method based on deep learning to accelerate the simulation of solute transport. We employ Physics-informed Neural Networks (PiNN) to solve solute transport problems in homogeneous and heterogeneous porous media governed by the advection-dispersion equation. Unlike traditional neural networks that learn from large training datasets, PiNNs only leverage the strong form mathematical models to simultaneously solve for multiple dependent or independent field variables (e.g., pressure and solute concentration fields). In this study, we construct PiNN using a periodic activation function to better represent the complex physical signals (i.e., pressure) and their derivatives (i.e., velocity). Several case studies are designed with the intention of investigating the proposed PiNN's capability to handle different degrees of complexity. A manual hyperparameter tuning method is used to find the best PiNN architecture for each test case. Point-wise error and mean square error (MSE) measures are employed to assess the performance of PiNNs' predictions against the ground truth solutions obtained analytically or numerically using the finite element method. Our findings show that the predictions of PiNN are in good agreement with the ground truth solutions while reducing computational complexity and cost by, at least, three orders of magnitude.
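
For orientation, the governing equation and the kind of loss a PiNN minimizes can be written out as follows. The abstract does not state these forms, so this is a sketch under assumptions: the textbook advection-dispersion equation without source or reaction terms, a sine activation with an assumed frequency scale \omega_0, and a loss made only of a PDE residual term plus a boundary/initial-condition term. The symbols N_r, N_b, x_i, t_i, x_j, t_j and g are introduced here for illustration.

```latex
% Advection-dispersion equation for concentration C(x, t),
% with velocity field v and dispersion tensor D (assumed standard form):
\frac{\partial C}{\partial t} + \mathbf{v} \cdot \nabla C
  = \nabla \cdot \left( \mathbf{D} \, \nabla C \right)

% Periodic (sine) activation applied to a pre-activation z:
\phi(z) = \sin(\omega_0 z)

% Physics-informed loss for a network C_\theta: PDE residual at N_r
% collocation points plus mismatch with boundary/initial data g at N_b points:
\mathcal{L}(\theta) =
  \frac{1}{N_r} \sum_{i=1}^{N_r}
    \left| \frac{\partial C_\theta}{\partial t}
      + \mathbf{v} \cdot \nabla C_\theta
      - \nabla \cdot \left( \mathbf{D} \, \nabla C_\theta \right)
    \right|^2_{(\mathbf{x}_i, t_i)}
  + \frac{1}{N_b} \sum_{j=1}^{N_b}
    \left| C_\theta(\mathbf{x}_j, t_j) - g(\mathbf{x}_j, t_j) \right|^2
```

Because the residual involves first and second derivatives of the network output, a smooth activation whose derivatives are themselves well behaved is a natural fit, which is consistent with the abstract's motivation for a periodic activation.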


Task Discovery: Finding the Tasks that Neural Networks Generalize on

Andrei Atanov , Andrei Filatov , Teresa Yeo , Ajay Sohmshetty , Amir Zamir

Category: Machine Learning

2022-12-01

When developing deep learning models, we usually decide what task we want to solve then search for a model that generalizes well on the task. An intriguing question would be: what if, instead of fixing the task and searching in the model space, we fix the model and search in the task space? Can we find tasks that the model generalizes on? How do they look, or do they indicate anything? These are the questions we address in this paper. We propose a task discovery framework that automatically finds examples of such tasks via optimizing a generalization-based quantity called agreement score. We demonstrate that one set of images can give rise to many tasks on which neural networks generalize well. These tasks are a reflection of the inductive biases of the learning framework and the statistical patterns present in the data, thus they can make a useful tool for analysing the neural networks and their biases. As an example, we show that the discovered tasks can be used to automatically create adversarial train-test splits which make a model fail at test time, without changing the pixels or labels, but by only selecting how the datapoints should be split between the train and test sets. We end with a discussion on human-interpretability of the discovered tasks.
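
The agreement score mentioned above can be illustrated with a minimal sketch. The abstract only names the quantity, so the definition used here is an assumption: the fraction of held-out points on which two models, trained on the same task labels from different random seeds, predict the same class. The function name and the toy predictions are hypothetical, and the training of the two models is omitted.

```python
import numpy as np


def agreement_score(preds_a, preds_b):
    """Fraction of test points on which two models predict the same label.

    preds_a and preds_b are assumed to be predictions on the same test set
    from two networks trained on the same task with different seeds.
    """
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    if preds_a.shape != preds_b.shape:
        raise ValueError("prediction arrays must have the same shape")
    return float(np.mean(preds_a == preds_b))


# Hypothetical binary predictions from two independently trained networks.
run_1 = [0, 1, 1, 0, 1, 0, 0, 1]
run_2 = [0, 1, 0, 0, 1, 0, 1, 1]
print(agreement_score(run_1, run_2))
```

Under this reading, a task with a high score is one where the learned function depends little on the seed, and searching the task space means searching over labelings of the images that maximize this score.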


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.


MP-SeizNet: A Multi-Path CNN Bi-LSTM Network for Seizure-Type Classification Using EEG

Hezam Albaqami , Ghulam Mubashar Hassan , Amitava Datta

Category: Machine Learning

2022-11-09

Seizure type identification is essential for the treatment and management of epileptic patients. However, it is a difficult process known to be time consuming and labor intensive. Automated diagnosis systems, with the advancement of machine learning algorithms, have the potential to accelerate the classification process, alert patients, and support physicians in making quick and accurate decisions. In this paper, we present a novel multi-path seizure-type classification deep learning network (MP-SeizNet), consisting of a convolutional neural network (CNN) and a bidirectional long short-term memory neural network (Bi-LSTM) with an attention mechanism. The objective of this study was to classify specific types of seizures, including complex partial, simple partial, absence, tonic, and tonic-clonic seizures, using only electroencephalogram (EEG) data. The EEG data is fed to our proposed model in two different representations. The CNN was fed with wavelet-based features extracted from the EEG signals, while the Bi-LSTM was fed with raw EEG signals, so that MP-SeizNet jointly learns from different representations of seizure data for more accurate information learning. The proposed MP-SeizNet was evaluated using the largest available EEG epilepsy database, the Temple University Hospital EEG Seizure Corpus, TUSZ v1.5.2. We evaluated our proposed model across different patient data using three-fold cross-validation and across seizure data using five-fold cross-validation, achieving F1 scores of 87.6% and 98.1%, respectively.
